1. ABSTRACT


Coronaviruses have the potential to cause 
significant economic, agricultural and health problems. 
The severe acute respiratory syndrome (SARS) associated 
coronavirus outbreak in late 2002, early 2003 called 
attention to the potential damage that coronaviruses could 
cause in the human population. The ensuing research has
shed light on the molecular biology of
coronaviruses. A programmed -1 ribosomal frameshift is
required by coronaviruses for the production of the RNA
dependent RNA polymerase, which in turn is essential for
viral replication. The frameshifting signal encoded in the 
viral genome has additional features that are not essential 
for frameshifting. Elucidation of the differences between 
coronavirus frameshift signals and signals from other 
viruses may help our understanding of these features. Here
we summarize current knowledge and offer additional
insight regarding the function of the programmed -1
ribosomal frameshift signal in the coronavirus lifecycle. 
2. INTRODUCTION


The Coronaviridae family comprises
toroviruses and at least three groups of coronaviruses
which, together with the Arteriviridae and Roniviridae,
belong to the order Nidovirales (1;
http://www.ncbi.nlm.nih.gov/ICTVdb/Ictv/index.htm). The
arteriviruses and coronaviruses can cause enteric and 
respiratory tract infections in mammals and birds while the 
roniviruses infect fish. The severity of pathogenesis varies 
depending on viral genotype. Outbreaks sometimes result in 
diarrhea and loss of livestock. After the SARS (severe acute 
respiratory syndrome) associated coronavirus epidemic in 
2002-2003, interest and research in coronaviruses increased
dramatically. Unlike the previously identified human 
coronaviruses (HCoV-229E and HCoV-OC43) which are 
more commonly associated with inconsequential respiratory 
infection, SARS-CoV had a mortality rate of 9.6% (2;
http://www.who.int/csr/sars/country/table2004_04_21/en/index.html).
The resulting resurgence in interest has added to
decades of work from a few groups and has
advanced our knowledge of the coronavirus lifecycle
considerably. This article focuses on how these viruses
manipulate the host protein synthetic machinery to produce
proteins essential for viral replication via a mechanism
called programmed -1 ribosomal frameshifting (-1 PRF).
We describe salient features and how study of the
frameshifting mechanism may enhance our understanding
of other parts of the viral lifecycle.


Figure 1. Coronavirus Lifecycle. Viral entry is mediated by a spike-receptor interaction and cathepsin L mediated membrane
fusion or endocytosis (A). After endocytosis the viral genome is released (B). The first phase of translation results in the
production of two polyproteins (C). The polyproteins are processed (D) and form double membraned vesicles (DMV). The
DMV is the site of both minus- and plus-strand replication (E). A second phase of translation occurs from which proteins
encoded in the subgenomic RNAs are made (F). New genomic RNA is packaged by nucleocapsid protein. The structural proteins
are processed in the ERGIC (G). The new viral particles exit the cell via exocytosis (H). See the main text for additional detail.


3. CORONAVIRUS LIFECYCLE


Successful infection of host cells requires many 
steps which are common to all coronaviruses (Figure 1). 
Typically the coronavirus spike (S) glycoprotein and/or the 
haemagglutinin proteins interact with a cellular receptor to 
mediate entry into the host cell. Coronaviruses can also 
enter the cell via endocytosis. While many group 1 
coronaviruses interact with an aminopeptidase, the SARS 
coronavirus uses the angiotensin converting enzyme 2 
(ACE2) and/or the C-type lectins DC/L-SIGN as the host
receptors (reviewed in 3). Cathepsin L cleaves the S 
protein and the viral envelope fuses with the host cell 
membrane. The virus disassembles, releasing the genomic 
RNA after which a replication-transcription complex forms 
on double membraned vesicles (4 and references within). 
New genomic and subgenomic RNA (sgRNA) is produced 
by the unique mechanism of discontinuous transcription 
during which negative-strand RNA intermediates are 
produced (reviewed in 5). Structural and accessory 
proteins are translated from the plus-strand sgRNA (Figure 
2). Nucleocapsid proteins (N) package the genomic RNA 
and are met by envelope proteins which accumulate in the 
ER-to-Golgi intermediate compartment (ERGIC) for 
assembly. After virus particles assemble they egress from 
the cell via exocytosis. 


Viruses by definition are dependent on the host
cellular machinery for replication. However, as the viral
lifecycle progresses inside the cell, the virus not only
usurps certain host enzymes, but it also generates proteins
that are not available from the host repertoire. For
example, translation of viral proteins from the initial
infectious RNA utilizes the host ribosomes while
generation of new virus RNA requires a virally encoded
RNA dependent RNA polymerase (RDRP). Many
coronavirus proteins are translated from subgenomic RNAs
(sgRNAs) rather than the genomic RNA (Figure 2). One
result of this is that translation of plus-strand viral message
RNA into proteins must theoretically occur in at least two
phases: first the genomic RNA serves as a template for
production of nonstructural proteins including the RDRP;
then the RDRP uses the genomic RNA as a template for the
production of sgRNAs (Figure 1). The structural and
accessory proteins are produced from the sgRNAs in the
next phase of translation. Thus, the second phase of
translation cannot occur without the production of enzymes
from the first phase. It is not known if successful infection
requires the presence of more than one copy of the genomic
RNA.


ORF 1a/b, which is translated in the first phase, encodes
a polyprotein that is cleaved into 16 non-structural proteins
(Figure 2). The nonstructural domains in ORF 1a (nsp1-11)
and ORF 1b (nsp12-16) are defined by proteolytic cleavage
sites (reviewed in 6). The functional domains suggest that
most of these proteins are involved in proteolytic cleavage
and production or modification of RNA. The nonstructural
proteins form replication complexes on the double
membraned vesicles (DMV). During RNA replication
negative strand copies of the genomic RNA are made along
with negative strand sgRNAs. These in turn serve as
templates for the production of positive strand genomic and
sgRNAs (reviewed in 5). The ORFs encoding structural
and accessory proteins are translated from sgRNAs. The
nonstructural proteins remain with the replication complex
on the DMV while the structural proteins migrate for
assembly into viral particles (7).


Figure 2. SARS Coronavirus Genome Organization. (A) There are nine major open reading frames in the SARS-CoV genome.
The first is divided into two overlapping parts, ORF 1a and ORF 1b, and comprises two thirds of the genome. In addition to the
full length genomic RNA, 5’ nested subgenomic RNAs are present in infected cells. The sgRNAs contain the same 5’ non-coding
region (bold line) as the full length genomic RNA. Two polyproteins are produced from the full length RNA (shaded
boxes). They are subsequently processed into functional units, non-structural proteins nsp1 through nsp16. The structural and
accessory proteins are translated from the sgRNAs (shaded boxes). See text for additional detail.


Production of proteins from ORF 1a/b does not
follow the usual rules of translation. Two polyproteins are
produced during the translation of one disjointed open
reading frame. The first polyprotein is encoded entirely
within ORF 1a and translation terminates at the stop codon
that defines ORF 1a, as is typical in normal translation.
However, there are signals contained within the RNA prior
to the stop codon that direct a fraction of elongating
ribosomes into an alternative reading frame, allowing them
to bypass the ORF 1a termination codon and continue
translation into ORF 1b, creating a larger polyprotein. This
redirection of the ribosomes to create two polyproteins has
been demonstrated for many viruses including arteriviruses
(8), roniviruses (9) and a number of coronaviruses (10-13).
The mechanism by which the ribosomes are redirected is
called programmed -1 ribosomal frameshifting (-1 PRF). It
is often at least two orders of magnitude more efficient than
baseline rates of ribosomal error. The efficiency of -1 PRF
may range from 15-60% depending on the assay system
and the amount of RNA flanking the core sequence (14-16).
This suggests that the flanking sequences are of some
importance; however, codon and reading frame constraints
pose some limitations on analyses of these flanking
sequences using current in vitro assay systems. Until the
relatively recent emergence of a variety of molecular and
viral tools specific to coronaviruses the pursuit of these
issues has been limited. The following section describes
how our understanding of -1 PRF has advanced with
particular emphasis on the many contributions made by
analysis of coronavirus frameshift signals.


Figure 3. Programmed -1 Ribosomal Frameshifting. During translation features intrinsic to the mRNA being decoded
manipulate the ribosome such that the reading frame is altered. (A) The ribosome pauses over a heptameric slippery site
(UUUAAAC in coronaviruses). The aminoacyl-tRNA (with the filled circle) and the peptidyl-tRNA (with open circles) are
positioned in the zero frame in the A- and P-sites of the ribosome. The pause and curbing of ribosome fidelity is stimulated by a
3’ RNA structure. (B) Both peptidyl- and aminoacyl-tRNAs un-pair from the mRNA and re-pair in the -1 reading frame. This is
facilitated by the anticodons being able to base pair at the non-wobble positions in the new reading frame. The pseudoknot
structure is resolved and translation continues in the new reading frame.
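

The bypass of the ORF 1a termination codon described above can be illustrated with a minimal Python sketch. The toy mRNA, the function names and the simplification that a ribosome slips exactly once, immediately after decoding the UUUAAAC heptamer, are assumptions made for illustration only; the sequence is not taken from any coronavirus genome and the standard genetic code is assumed.

```python
# Standard genetic code, built from the conventional UCAG ordering.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(mrna, start=0):
    """Translate from `start` until the first in-frame stop codon."""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def translate_with_minus_one_shift(mrna, slippery_site="UUUAAAC"):
    """Translate in frame 0, then slip back one nucleotide after the heptamer."""
    site = mrna.find(slippery_site)
    # The heptamer must be read as X XXY YYZ in frame 0 (frame 0 starts at index 0).
    if site < 0 or site % 3 != 2:
        raise ValueError("no slippery site in the expected frame")
    site_end = site + len(slippery_site)
    upstream = translate(mrna[:site_end])
    if len(upstream) < site_end // 3:
        return upstream  # terminated before reaching the slippery site
    # After the slip, the last nucleotide of the heptamer is read a second time.
    return upstream + translate(mrna, start=site_end - 1)


# Hypothetical toy message: AUG GCU UUA AAC GGG UAA ... in frame 0.
TOY_MRNA = "AUGGCUUUAAACGGGUAAGCUGCUAA"
print(translate(TOY_MRNA))                       # MALNG
print(translate_with_minus_one_shift(TOY_MRNA))  # MALNRVSC
```

In this toy message the zero-frame product ends at the first UAA, whereas the shifted ribosome re-reads the final C of the slippery site, avoids that stop codon and produces a C-terminally extended product.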


4. PROGRAMMED FRAMESHIFTING


Programmed -1 ribosomal frameshifting (-1 PRF) 
is a mechanism used to regulate gene expression at the 
level of protein synthesis. As ribosomes translate one ORF 
they encounter a signal in the mRNA that directs a fraction 
of them to shift into an alternative downstream ORF which 
is in the -1 phase relative to the initiating upstream ORF 
(Figure 3). In viruses -1 PRF usually results in a C- 
terminally extended polyprotein containing additional 
function not present in the upstream ORF. The use of a -1 
PRF mechanism for expression of a viral gene was first 
published in 1985 for the Rous sarcoma virus (17) and 
subsequently for other retroviruses (18). The first complete
coronavirus sequence was published in 1987 (IBV; 19) and
later that same year an in vitro translation system was used to
demonstrate that a -1 PRF mechanism was used to translate
ORF 1ab (10). In subsequent years, the IBV frameshift
signal has been extensively analyzed by Brierley and
co-workers and has become one of the best characterized
-1 PRF signals.


-1 PRF signals are usually composed of a 
“slippery site” followed by a stimulatory structure. These 
two elements are typically separated by a short spacer 
region. The slippery site is composed of a heptameric 
sequence such that the A- and P-site tRNAs can un-pair 
from the mRNA and re-pair in the -1 reading frame (20; 
Figure 3). The nucleotides surrounding the heptameric 
slippery site have been shown to have a limited effect on 
frameshifting efficiencies. Experiments altering the spacer 
region between the slippery site and stimulatory element 
reduced frameshifting efficiency suggesting that there 
might be some optimal spacer sequence (27-30). The three
nucleotides 5’ of the heptameric sequence also affect -1 
PRF efficiency suggesting a role for the exiting tRNA in 
the ribosomal E-site (24, 31). The stimulatory element has 
been shown to contribute significantly to -1 PRF 
efficiencies. 
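

The re-pairing requirement at the slippery site can be expressed as a short check. The following Python sketch is a simplification under stated assumptions: it treats re-pairing as possible only when the first two (non-wobble) nucleotides of each codon are identical before and after the shift, it ignores non-Watson-Crick pairing, the spacer and the stimulatory structure, and the function names are hypothetical.

```python
def repairs_in_minus_one_frame(heptamer):
    """Return True if both tRNAs keep their non-wobble pairing after a -1 slip."""
    if len(heptamer) != 7:
        raise ValueError("a slippery site is seven nucleotides long")
    # Zero frame: P-site codon is nt 2-4 and A-site codon is nt 5-7.
    p_zero, a_zero = heptamer[1:4], heptamer[4:7]
    # -1 frame: both codons move one nucleotide towards the 5' end.
    p_shifted, a_shifted = heptamer[0:3], heptamer[3:6]
    return p_zero[:2] == p_shifted[:2] and a_zero[:2] == a_shifted[:2]


def find_slippery_sites(mrna):
    """Zero-based starts of heptamers passing the check, assuming frame 0 starts at index 0."""
    return [
        i
        for i in range(2, len(mrna) - 6, 3)
        if repairs_in_minus_one_frame(mrna[i:i + 7])
    ]


print(repairs_in_minus_one_frame("UUUAAAC"))              # True
print(repairs_in_minus_one_frame("UUCAAAC"))              # False
print(find_slippery_sites("AUGGCUUUAAACGGGUAAGCUGCUAA"))  # [5]
```

A scan of this kind only nominates candidate heptamers; as the text notes, the downstream stimulatory element contributes significantly to the efficiency actually observed.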


While the stimulatory structure was initially
postulated to be a simple mRNA stem-loop, studies of
the IBV -1 PRF signal provided the first evidence for 
the requirement of a more complex mRNA pseudoknot 
(27; Figure 4). Subsequently mRNA pseudoknots were 
identified in the frameshift signals of a wide variety of 
plant and animal viruses. As additional viral sequences 
became available more elaborate stimulatory structures 
were identified in coronaviruses. These include “kissing 
loops” (32), and three stemmed mRNA pseudoknots, 
which were predicted for the coronavirus and the related 
torovirus Berne virus (33-34), and subsequently
demonstrated by nuclease mapping for the SARS 
coronavirus (15-16). The variation in these stimulatory 
elements suggests that the additional features might be 
required for fine-tuning frameshifting efficiency or, 
alternatively, involved in additional viral functions. 
Interestingly, efficient frameshifting was observed when 
the third stem was deleted from the SARS-CoV 
pseudoknot, or when a similar region was deleted from 
the IBV stimulatory structure, suggesting that these 
regions are not required to modulate -1 PRF (15, 35). 
However, it is clear from mutational analyses that,
when the third stem is present, it has an effect on
-1 PRF (14-15). Furthermore, additional sequence
upstream of the core frameshift signal has been 
shown to affect -1 PRF efficiency in SARS-CoV (16). 
Thus, although core essential elements of the 
frameshift signal have been defined, the scope of factors, 
either cis- or trans-acting, has not yet been revealed. 
Figure 4. Coronavirus Frameshift Signals. Two to three RNA stems following a heptameric slippery site comprise the
coronavirus frameshift signal. The slippery site is underlined. The proposed IBV, SARS-CoV and HCoV-229E pseudoknot
structures are shown (A to C respectively). The IBV structure has two stems, the SARS-CoV structure contains an additional
internal stem, and the HCoV-229E structure is formed by two ‘kissing’ stem-loops. See text for additional detail.


A number of models have been proposed to 
describe the mechanism by which -1 PRF occurs (20-24). 
All the models posit that the stimulatory element causes a 
pause in translation and that base-pairing is required at the 
non-wobble positions of at least two tRNA molecules to the 
mRNA after the frameshift (Figure 3). Differences among 
the models are centered on the timing of the frameshift 
within the context of the elongation cycle. The detection of 
two different frameshift products by protein sequencing 
(18, 20) suggests that the different models may not be 
mutually exclusive. Analysis of frameshifting is
complicated somewhat by the limited availability of malleable
experimental systems imitating the appropriate host cell. It
has been shown that prokaryotic ribosomes decipher
coronavirus frameshift signals quite differently from yeast,
plant or mammalian ribosomes (25-26). Thus a suitable
system must be used to draw meaningful conclusions from
in vitro analyses of -1 PRF. The prevalence of
coronaviruses and their spread among a wide range of 
mammals including bats (1) suggests that analyses in 
mammalian cells are appropriate in most instances. 


5. WHAT IS THE ROLE OF -1 PRF IN THE CORONAVIRUS LIFECYCLE?


While progress has been made on elucidating the 
mechanism of -1 programmed ribosomal frameshifting and 
the RNA sequences involved in coronaviruses, the
requirement for -1 PRF in the lifecycle of this class of 
viruses remains obscure. For other viruses, such as HIV 
and the yeast totivirus L-A, frameshifting regulates the 
relative ratios of structural to enzymatic proteins. The
relative abundance of the coat proteins to viral RNA affects
packaging and deviations from the optimal ratio result in a
loss of infectivity (36-38). In contrast, -1 PRF in
coronaviruses modulates the relative ratios of two different
classes of enzymatic proteins: proteases (and other
uncharacterized proteins) encoded by the upstream ORF 1a,
and RDRPs and RNA modifying enzymes encoded by
ORF 1b. As coronavirus structural proteins are encoded on
sgRNAs (the transcription of which is dependent on the
frameshift), the role of -1 PRF in both the levels and
timing of their synthesis, and in virus propagation in
general, has not yet been characterized. More specifically,
the functional domains of the predicted proteases are 
encoded in nsp3 and nsp5 within ORF 1a, prior to the -1
PRF site. The RNA modifying functions (RNA dependent
RNA polymerase, helicase, exoribonuclease,
uridylate-specific endoribonuclease and
S-adenosylmethionine-dependent ribose 2’-O-methyltransferase) are encoded in
nsp12-16 after the -1 PRF site (Figure 2). The reason for
regulating the abundance of these proteins relative to one 
another is unknown. Nsp8 was recently described as a 
second RDRP raising the possibility that nsp8 and nsp12 
ratios are important for controlling the amount of different 
RNA transcripts made during replication (39). The
mechanism by which 100-fold more plus-strand RNA is
made relative to the negative strand, and the mechanism that
directs production of sgRNA rather than genomic RNA,
are not known (5). While it is possible that the
ratio of nsp8 to nsp12 protein products may affect one of
these mechanisms, this suggestion does not take into
account the relative ratios of the other 14 proteins encoded
in ORF 1a/b.
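

The way frameshifting efficiency sets the relative abundance of upstream and downstream products can be made concrete with a small calculation. The Python sketch below rests on assumptions that are not established in the text: every ribosome that initiates is taken to reach the frameshift signal, protein turnover is ignored, and domains encoded upstream of the signal are counted in both polyproteins because the larger product is a C-terminal extension of the smaller one. The function names are hypothetical.

```python
def polyprotein_fractions(frameshift_efficiency):
    """Fractions of ribosomes yielding the shorter and the extended polyprotein."""
    if not 0.0 <= frameshift_efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return 1.0 - frameshift_efficiency, frameshift_efficiency


def upstream_to_downstream_ratio(frameshift_efficiency):
    """Copies of an ORF 1a-encoded domain per copy of an ORF 1b-encoded domain."""
    if not 0.0 < frameshift_efficiency <= 1.0:
        raise ValueError("efficiency must be greater than 0 and at most 1")
    # Upstream domains are made by every ribosome, downstream domains only
    # by the fraction that shifts frame.
    return 1.0 / frameshift_efficiency


# The 15-60% range is the efficiency range quoted earlier in the text.
for efficiency in (0.15, 0.60):
    ratio = upstream_to_downstream_ratio(efficiency)
    print(f"efficiency {efficiency:.2f}: upstream/downstream = {ratio:.2f}")
```

Under these assumptions the quoted efficiencies correspond to roughly seven-fold and less than two-fold excesses of ORF 1a-encoded domains, which illustrates how strongly the measured efficiency would bear on any model built around protein ratios.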


The genomic RNA from plus-strand RNA viruses 
serves at least two functions during infection: 1) it acts as 
the mRNA from which viral proteins are translated, and 2) 
it is the template from which new genomic and subgenomic 
RNA is transcribed. How the infectious RNA transitions 
from one function to the other remains unanswered. Some 
progress has been made in our understanding of how 
another plus-strand RNA virus, the Barley yellow dwarf
virus (BYDV), regulates ribosome and replicase traffic on
an RNA template (40). Unlike the SARS coronavirus 
which utilizes a pseudoknot for frameshifting, BYDV 
requires a kissing-loop interaction similar to that described 
for the human coronavirus 229E frameshifting (32). The 
model requires disruption of long range RNA:RNA 
interactions which, if they do not reform, allow a switch 
between translation and transcription. In BWYYV, interactions 
between a sgRNA and the genomic RNA also inhibit 
translation of the genomic RNA leaving it available for 
transcription or packaging (41). Such long range RNA:RNA 
interactions or interactions between gRNA and sgRNAs have 
not yet been identified in coronaviruses. An RNA switch in 
the 3’ UTR of MHV has been characterized and found to be 
essential (42). This motif is found in all group 2 coronavirus
sequences but only the 5’ or 3’ portion appears to be conserved
in group 1 or group 3 coronaviruses. Interestingly, a third
stem-loop in the pseudoknot of the -1 PRF signal is predicted to be
conserved among the group 2 coronaviruses but not in the group
1 coronaviruses which utilize a kissing-loop for frameshifting
(15). Some alterations to the third stem in the SARS 
coronavirus pseudoknot result in a loss of infectivity without 
dramatically affecting frameshifting, and a subset of viral 
proteins encoded by the subgenomic RNAs have been 
identified that bind to the pseudoknot in the SARS -1 PRF 
signal (Plant and Dinman, unpublished data). These 
findings suggest that this region of the SARS (+) strand is 
vital for an aspect of the virus lifecycle other than -1 PRF. 


One current research challenge lies in producing 
mutations having only moderate effects on -1 PRF so that 
more meaningful virology can be pursued. As these mutant 
viruses and replicons become available we will be able to 
correlate the efficiency of frameshifting with production of 
genomic and subgenomic RNAs, and with viral titers. It is 
expected that some of these mutations will result in defects 
that will give insight into the function of the internal stem 
loop (stem 3) of the frameshift signal, and that the
insight thus gained will provide an alternative starting point 
for dissecting the coronavirus replication system. 


6. PERSPECTIVES 


The prospects for studies of both -1 PRF and 
coronaviruses are encouraging. A number of synergistic 
advances are being made in both areas. The constantly 
expanding number of coronavirus sequences is enhancing 
the ability of researchers to identify conserved RNA 
sequence and structural motifs with a greater degree of 
confidence. As critical elements are identified experiments 
can be thoughtfully designed to generate mutant viruses
and replicons from which useful information can be 
acquired. The size of the coronavirus genome and the lack 
of unique restriction sites for cloning pose some difficulties 
in manipulation of the virus. However, recent advances in
the stability of plasmid vectors and the availability of class 
II restriction endonucleases or restriction enzymes that 
cleave adjacent to the recognition sequence have 
circumvented some of these difficulties (43). The 
available clones and replicons (44-45) have allowed 
many groups to readily investigate different aspects of 
the coronavirus lifecycle. Developments in the NMR 
field are also enabling the solution of larger RNA 
structures such as those that direct -1 PRF. In addition, 
advances are being made in the design of algorithms 
able to predict tertiary RNA structures such as
frameshift-promoting mRNA pseudoknots (46-48). As 
more structural data are generated the computational 
algorithms will be refined which in turn will provide 
enhanced tools for experimental design by bench-based 
researchers. 


The coronavirus frameshift signals are complex 
and diverse and, as described above, have yielded a wealth 
of molecular biological data describing features important 
for -1 PRF. As noted previously, there are several 
limitations to -1 PRF studies, the most pertinent being that 
there are two overlapping open reading frames to maintain. 
Obviously silent protein coding mutations are preferable so 
that only recoding events are analyzed rather than protein 
function. The termination codon for the first ORF is very 
early in the SARS -1 PRF signal compared to the position 
of stop codons in other frameshift signals and this has 
increased the variety of mutations that can be sustained. A 
further limitation is that a frameshifting event must occur 
for production of the RNA dependent RNA polymerase 
which is essential for virus production. Thus mutations 
which abolish frameshifting completely will not produce 
replicative or infectious virus. It has been shown for some 
other viruses that there is an apparent threshold level of 
frameshifting required for competent virus production (36- 
38). Mutations that subtly alter the frequency of -1 PRF in
the coronavirus context are being discovered and these will
lead to a greater understanding of the role of -1 PRF in
coronavirus replication.
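

The constraint imposed by two overlapping reading frames can be illustrated by testing whether a point mutation is silent in both frames at once. The Python sketch below is a minimal example under stated assumptions: the standard genetic code is used, the zero frame is assumed to start at index 0, the test sequence is a hypothetical toy message rather than a coronavirus sequence, and the function names are illustrative.

```python
# Standard genetic code, built from the conventional UCAG ordering.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_containing(mrna, position, frame_offset):
    """Return the codon covering `position` when codons start at `frame_offset`."""
    start = position - ((position - frame_offset) % 3)
    if start < 0 or start + 3 > len(mrna):
        raise ValueError("no complete codon covers this position in this frame")
    return mrna[start:start + 3]


def is_silent(mrna, position, new_base, frame_offset):
    """True if the substitution leaves the encoded residue unchanged in one frame."""
    mutant = mrna[:position] + new_base + mrna[position + 1:]
    before = CODON_TABLE[codon_containing(mrna, position, frame_offset)]
    after = CODON_TABLE[codon_containing(mutant, position, frame_offset)]
    return before == after


def silent_in_both_frames(mrna, position, new_base, zero_offset=0):
    """True if the substitution is silent in the zero frame and in the -1 frame."""
    minus_one_offset = (zero_offset - 1) % 3
    return is_silent(mrna, position, new_base, zero_offset) and is_silent(
        mrna, position, new_base, minus_one_offset
    )


# Hypothetical toy message: AUG GGC GGG UAA in frame 0; C at index 5 starts CGG in the -1 frame.
TOY = "AUGGGCGGGUAA"
print(silent_in_both_frames(TOY, 5, "A"))  # True: GGC>GGA (Gly) and CGG>AGG (Arg)
print(silent_in_both_frames(TOY, 5, "U"))  # False: CGG>UGG changes Arg to Trp
```

Substitutions that pass such a check still alter the RNA sequence and may therefore affect the structure of the frameshift signal itself, so a filter of this kind is a starting point for mutant design rather than a guarantee that only recoding is affected.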
Abbreviations: SARS: severe acute respiratory syndrome;
SARS-CoV: SARS coronavirus; S: Spike protein; gRNA: 
genomic RNA; sgRNA: subgenomic RNA; RDRP: RNA 
dependent RNA polymerase; ORF: open reading frame; 
nsp: non-structural protein; DMV: double membraned 
vesicle; -1 PRF: programmed -1 ribosomal frameshifting; 
BYDV: Barley yellow dwarf virus. 